This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button, a document will be generated that includes both the content and the output of any embedded R code chunks. You can embed an R code chunk like this:
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(tidyverse)
## ── Attaching packages ───────────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.0 ✓ purrr 0.3.3
## ✓ tibble 2.1.3 ✓ stringr 1.4.0
## ✓ tidyr 1.0.2 ✓ forcats 0.5.0
## ✓ readr 1.3.1
## ── Conflicts ──────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(ggplot2)
library(missForest)
## Loading required package: randomForest
## randomForest 4.6-14
## Type rfNews() to see new features/changes/bug fixes.
##
## Attaching package: 'randomForest'
## The following object is masked from 'package:ggplot2':
##
## margin
## The following object is masked from 'package:dplyr':
##
## combine
## Loading required package: foreach
##
## Attaching package: 'foreach'
## The following objects are masked from 'package:purrr':
##
## accumulate, when
## Loading required package: itertools
## Loading required package: iterators
library(mice)
##
## Attaching package: 'mice'
## The following objects are masked from 'package:base':
##
## cbind, rbind
library(arm)
## Loading required package: MASS
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
## Loading required package: lme4
##
## arm (Version 1.10-1, built: 2018-4-12)
## Working directory is /Users/Yen/Documents/master degree/MIS/Spring 2020/STATISTICAL AND PREDICTIVE ANALYTICS/IS 6489 Group Project
library(caret)
## Loading required package: lattice
##
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
##
## lift
library(moments)
Summary of the training data:
## Id MSSubClass MSZoning LotFrontage
## Min. : 1.0 Min. : 20.0 C (all): 10 Min. : 21.00
## 1st Qu.: 365.8 1st Qu.: 20.0 FV : 65 1st Qu.: 59.00
## Median : 730.5 Median : 50.0 RH : 16 Median : 69.00
## Mean : 730.5 Mean : 56.9 RL :1151 Mean : 70.05
## 3rd Qu.:1095.2 3rd Qu.: 70.0 RM : 218 3rd Qu.: 80.00
## Max. :1460.0 Max. :190.0 Max. :313.00
## NA's :259
## LotArea Street Alley LotShape LandContour Utilities
## Min. : 1300 Grvl: 6 Grvl: 50 IR1:484 Bnk: 63 AllPub:1459
## 1st Qu.: 7554 Pave:1454 Pave: 41 IR2: 41 HLS: 50 NoSeWa: 1
## Median : 9478 NA's:1369 IR3: 10 Low: 36
## Mean : 10517 Reg:925 Lvl:1311
## 3rd Qu.: 11602
## Max. :215245
##
## LotConfig LandSlope Neighborhood Condition1 Condition2
## Corner : 263 Gtl:1382 NAmes :225 Norm :1260 Norm :1445
## CulDSac: 94 Mod: 65 CollgCr:150 Feedr : 81 Feedr : 6
## FR2 : 47 Sev: 13 OldTown:113 Artery : 48 Artery : 2
## FR3 : 4 Edwards:100 RRAn : 26 PosN : 2
## Inside :1052 Somerst: 86 PosN : 19 RRNn : 2
## Gilbert: 79 RRAe : 11 PosA : 1
## (Other):707 (Other): 15 (Other): 2
## BldgType HouseStyle OverallQual OverallCond YearBuilt
## 1Fam :1220 1Story :726 Min. : 1.000 Min. :1.000 Min. :1872
## 2fmCon: 31 2Story :445 1st Qu.: 5.000 1st Qu.:5.000 1st Qu.:1954
## Duplex: 52 1.5Fin :154 Median : 6.000 Median :5.000 Median :1973
## Twnhs : 43 SLvl : 65 Mean : 6.099 Mean :5.575 Mean :1971
## TwnhsE: 114 SFoyer : 37 3rd Qu.: 7.000 3rd Qu.:6.000 3rd Qu.:2000
## 1.5Unf : 14 Max. :10.000 Max. :9.000 Max. :2010
## (Other): 19
## YearRemodAdd RoofStyle RoofMatl Exterior1st Exterior2nd
## Min. :1950 Flat : 13 CompShg:1434 VinylSd:515 VinylSd:504
## 1st Qu.:1967 Gable :1141 Tar&Grv: 11 HdBoard:222 MetalSd:214
## Median :1994 Gambrel: 11 WdShngl: 6 MetalSd:220 HdBoard:207
## Mean :1985 Hip : 286 WdShake: 5 Wd Sdng:206 Wd Sdng:197
## 3rd Qu.:2004 Mansard: 7 ClyTile: 1 Plywood:108 Plywood:142
## Max. :2010 Shed : 2 Membran: 1 CemntBd: 61 CmentBd: 60
## (Other): 2 (Other):128 (Other):136
## MasVnrType MasVnrArea ExterQual ExterCond Foundation BsmtQual
## BrkCmn : 15 Min. : 0.0 Ex: 52 Ex: 3 BrkTil:146 Ex :121
## BrkFace:445 1st Qu.: 0.0 Fa: 14 Fa: 28 CBlock:634 Fa : 35
## None :864 Median : 0.0 Gd:488 Gd: 146 PConc :647 Gd :618
## Stone :128 Mean : 103.7 TA:906 Po: 1 Slab : 24 TA :649
## NA's : 8 3rd Qu.: 166.0 TA:1282 Stone : 6 NA's: 37
## Max. :1600.0 Wood : 3
## NA's :8
## BsmtCond BsmtExposure BsmtFinType1 BsmtFinSF1 BsmtFinType2
## Fa : 45 Av :221 ALQ :220 Min. : 0.0 ALQ : 19
## Gd : 65 Gd :134 BLQ :148 1st Qu.: 0.0 BLQ : 33
## Po : 2 Mn :114 GLQ :418 Median : 383.5 GLQ : 14
## TA :1311 No :953 LwQ : 74 Mean : 443.6 LwQ : 46
## NA's: 37 NA's: 38 Rec :133 3rd Qu.: 712.2 Rec : 54
## Unf :430 Max. :5644.0 Unf :1256
## NA's: 37 NA's: 38
## BsmtFinSF2 BsmtUnfSF TotalBsmtSF Heating HeatingQC
## Min. : 0.00 Min. : 0.0 Min. : 0.0 Floor: 1 Ex:741
## 1st Qu.: 0.00 1st Qu.: 223.0 1st Qu.: 795.8 GasA :1428 Fa: 49
## Median : 0.00 Median : 477.5 Median : 991.5 GasW : 18 Gd:241
## Mean : 46.55 Mean : 567.2 Mean :1057.4 Grav : 7 Po: 1
## 3rd Qu.: 0.00 3rd Qu.: 808.0 3rd Qu.:1298.2 OthW : 2 TA:428
## Max. :1474.00 Max. :2336.0 Max. :6110.0 Wall : 4
##
## CentralAir Electrical X1stFlrSF X2ndFlrSF LowQualFinSF
## N: 95 FuseA: 94 Min. : 334 Min. : 0 Min. : 0.000
## Y:1365 FuseF: 27 1st Qu.: 882 1st Qu.: 0 1st Qu.: 0.000
## FuseP: 3 Median :1087 Median : 0 Median : 0.000
## Mix : 1 Mean :1163 Mean : 347 Mean : 5.845
## SBrkr:1334 3rd Qu.:1391 3rd Qu.: 728 3rd Qu.: 0.000
## NA's : 1 Max. :4692 Max. :2065 Max. :572.000
##
## GrLivArea BsmtFullBath BsmtHalfBath FullBath
## Min. : 334 Min. :0.0000 Min. :0.00000 Min. :0.000
## 1st Qu.:1130 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:1.000
## Median :1464 Median :0.0000 Median :0.00000 Median :2.000
## Mean :1515 Mean :0.4253 Mean :0.05753 Mean :1.565
## 3rd Qu.:1777 3rd Qu.:1.0000 3rd Qu.:0.00000 3rd Qu.:2.000
## Max. :5642 Max. :3.0000 Max. :2.00000 Max. :3.000
##
## HalfBath BedroomAbvGr KitchenAbvGr KitchenQual TotRmsAbvGrd
## Min. :0.0000 Min. :0.000 Min. :0.000 Ex:100 Min. : 2.000
## 1st Qu.:0.0000 1st Qu.:2.000 1st Qu.:1.000 Fa: 39 1st Qu.: 5.000
## Median :0.0000 Median :3.000 Median :1.000 Gd:586 Median : 6.000
## Mean :0.3829 Mean :2.866 Mean :1.047 TA:735 Mean : 6.518
## 3rd Qu.:1.0000 3rd Qu.:3.000 3rd Qu.:1.000 3rd Qu.: 7.000
## Max. :2.0000 Max. :8.000 Max. :3.000 Max. :14.000
##
## Functional Fireplaces FireplaceQu GarageType GarageYrBlt
## Maj1: 14 Min. :0.000 Ex : 24 2Types : 6 Min. :1900
## Maj2: 5 1st Qu.:0.000 Fa : 33 Attchd :870 1st Qu.:1961
## Min1: 31 Median :1.000 Gd :380 Basment: 19 Median :1980
## Min2: 34 Mean :0.613 Po : 20 BuiltIn: 88 Mean :1979
## Mod : 15 3rd Qu.:1.000 TA :313 CarPort: 9 3rd Qu.:2002
## Sev : 1 Max. :3.000 NA's:690 Detchd :387 Max. :2010
## Typ :1360 NA's : 81 NA's :81
## GarageFinish GarageCars GarageArea GarageQual GarageCond
## Fin :352 Min. :0.000 Min. : 0.0 Ex : 3 Ex : 2
## RFn :422 1st Qu.:1.000 1st Qu.: 334.5 Fa : 48 Fa : 35
## Unf :605 Median :2.000 Median : 480.0 Gd : 14 Gd : 9
## NA's: 81 Mean :1.767 Mean : 473.0 Po : 3 Po : 7
## 3rd Qu.:2.000 3rd Qu.: 576.0 TA :1311 TA :1326
## Max. :4.000 Max. :1418.0 NA's: 81 NA's: 81
##
## PavedDrive WoodDeckSF OpenPorchSF EnclosedPorch X3SsnPorch
## N: 90 Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## P: 30 1st Qu.: 0.00 1st Qu.: 0.00 1st Qu.: 0.00 1st Qu.: 0.00
## Y:1340 Median : 0.00 Median : 25.00 Median : 0.00 Median : 0.00
## Mean : 94.24 Mean : 46.66 Mean : 21.95 Mean : 3.41
## 3rd Qu.:168.00 3rd Qu.: 68.00 3rd Qu.: 0.00 3rd Qu.: 0.00
## Max. :857.00 Max. :547.00 Max. :552.00 Max. :508.00
##
## ScreenPorch PoolArea PoolQC Fence MiscFeature
## Min. : 0.00 Min. : 0.000 Ex : 2 GdPrv: 59 Gar2: 2
## 1st Qu.: 0.00 1st Qu.: 0.000 Fa : 2 GdWo : 54 Othr: 2
## Median : 0.00 Median : 0.000 Gd : 3 MnPrv: 157 Shed: 49
## Mean : 15.06 Mean : 2.759 NA's:1453 MnWw : 11 TenC: 1
## 3rd Qu.: 0.00 3rd Qu.: 0.000 NA's :1179 NA's:1406
## Max. :480.00 Max. :738.000
##
## MiscVal MoSold YrSold SaleType
## Min. : 0.00 Min. : 1.000 Min. :2006 WD :1267
## 1st Qu.: 0.00 1st Qu.: 5.000 1st Qu.:2007 New : 122
## Median : 0.00 Median : 6.000 Median :2008 COD : 43
## Mean : 43.49 Mean : 6.322 Mean :2008 ConLD : 9
## 3rd Qu.: 0.00 3rd Qu.: 8.000 3rd Qu.:2009 ConLI : 5
## Max. :15500.00 Max. :12.000 Max. :2010 ConLw : 5
## (Other): 9
## SaleCondition SalePrice
## Abnorml: 101 Min. : 34900
## AdjLand: 4 1st Qu.:129975
## Alloca : 12 Median :163000
## Family : 20 Mean :180921
## Normal :1198 3rd Qu.:214000
## Partial: 125 Max. :755000
##
Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the summary above.
#Clean Train Data
#add new factor level.
train$Alley = factor(train$Alley, levels=c(levels(train$Alley), "No Alley Access"))
train$BsmtQual = factor(train$BsmtQual, levels=c(levels(train$BsmtQual), "No Bsmt"))
train$BsmtCond = factor(train$BsmtCond, levels=c(levels(train$BsmtCond), "No Bsmt"))
train$BsmtExposure = factor(train$BsmtExposure, levels=c(levels(train$BsmtExposure), "No Bsmt"))
train$BsmtFinType1 = factor(train$BsmtFinType1, levels=c(levels(train$BsmtFinType1), "No Bsmt"))
train$BsmtFinType2 = factor(train$BsmtFinType2, levels=c(levels(train$BsmtFinType2), "No Bsmt"))
train$FireplaceQu = factor(train$FireplaceQu, levels=c(levels(train$FireplaceQu), "No Fireplace"))
train$GarageType = factor(train$GarageType, levels=c(levels(train$GarageType), "No Garage"))
train$GarageFinish = factor(train$GarageFinish, levels=c(levels(train$GarageFinish), "No Garage"))
train$GarageQual = factor(train$GarageQual, levels=c(levels(train$GarageQual), "No Garage"))
train$GarageCond = factor(train$GarageCond, levels=c(levels(train$GarageCond), "No Garage"))
train$PoolQC= factor(train$PoolQC, levels=c(levels(train$PoolQC), "No Pool"))
train$Fence= factor(train$Fence, levels=c(levels(train$Fence), "No Fence"))
train$MiscFeature= factor(train$MiscFeature, levels=c(levels(train$MiscFeature), "None"))
#convert all NA's to values
train$Alley[is.na(train$Alley)] = "No Alley Access"
train$BsmtQual[is.na(train$BsmtQual)] = "No Bsmt"
train$BsmtCond[is.na(train$BsmtCond)] = "No Bsmt"
train$BsmtExposure[is.na(train$BsmtExposure)] = "No Bsmt"
train$BsmtFinType1[is.na(train$BsmtFinType1)] = "No Bsmt"
train$BsmtFinType2[is.na(train$BsmtFinType2)] = "No Bsmt"
train$FireplaceQu[is.na(train$FireplaceQu)] = "No Fireplace"
train$GarageType[is.na(train$GarageType)] = "No Garage"
train$GarageFinish[is.na(train$GarageFinish)] = "No Garage"
train$GarageQual[is.na(train$GarageQual)] = "No Garage"
train$GarageCond[is.na(train$GarageCond)] = "No Garage"
train$PoolQC[is.na(train$PoolQC)] = "No Pool"
train$Fence[is.na(train$Fence)] = "No Fence"
train$MiscFeature[is.na(train$MiscFeature)] = "None"
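The two steps above (add a new factor level, then recode `NA` to that level) repeat the same pattern for every column. A minimal sketch of how the pattern could be wrapped in one helper follows. The function name `recode_na_level`, the vector `na_labels`, and the toy data frame are hypothetical and are not part of the original analysis.

```r
# Hypothetical helper: add `label` as a factor level and use it in place of NA.
recode_na_level <- function(x, label) {
  x <- factor(x, levels = c(levels(factor(x)), label))
  x[is.na(x)] <- label
  x
}

# Toy data standing in for a few columns of `train` (values are made up).
toy <- data.frame(
  Alley  = factor(c("Grvl", NA, "Pave", NA)),
  PoolQC = factor(c(NA, "Ex", NA, NA))
)

# Named vector: column name -> label meaning "feature absent".
na_labels <- c(Alley = "No Alley Access", PoolQC = "No Pool")

for (col in names(na_labels)) {
  toy[[col]] <- recode_na_level(toy[[col]], na_labels[[col]])
}

table(toy$Alley)
```

Because the helper calls `factor(x)` first, it also works when a column is a character vector, which is what `read.csv` returns by default from R 4.0.0 onward.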
#YearBuilt, change to age, remove the original column
train <- train %>%
dplyr::mutate(Age = 2020 - YearBuilt) %>%
dplyr::select(-YearBuilt)
#GarageYrBlt change to GarageAge
train <- train %>%
dplyr::mutate(GarageAge = 2020 - GarageYrBlt) %>%
dplyr::select(-GarageYrBlt)
#YearRemodAdd, change to difference between 2020 and YearRemodAdd
train <- train %>%
dplyr::mutate(YearRemodAdd = 2020 - YearRemodAdd)
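All three year columns are converted by subtracting them from the same year. As a rough sketch, keeping that year in a single constant makes it harder for a comment and the code to drift apart. The name `ref_year` and the toy rows are assumptions for illustration only.

```r
# Assumed reference year, matching the constant used in the code above.
ref_year <- 2020

# Toy rows standing in for `train` (values are made up).
toy <- data.frame(
  YearBuilt    = c(1950, 2006),
  GarageYrBlt  = c(1960, NA),
  YearRemodAdd = c(1995, 2007)
)

toy$Age          <- ref_year - toy$YearBuilt
toy$GarageAge    <- ref_year - toy$GarageYrBlt   # stays NA when the garage year is missing
toy$YearRemodAdd <- ref_year - toy$YearRemodAdd  # overwritten in place, as above
toy$YearBuilt    <- NULL                         # drop the original columns
toy$GarageYrBlt  <- NULL

toy
```

A missing `GarageYrBlt` simply yields a missing `GarageAge`, so it still has to be handled by a later imputation step.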
#load test data
test <- read.csv('test.csv', stringsAsFactors = TRUE) # read text columns as factors (no longer the default from R 4.0.0)
summary(test)
## Id MSSubClass MSZoning LotFrontage
## Min. :1461 Min. : 20.00 C (all): 15 Min. : 21.00
## 1st Qu.:1826 1st Qu.: 20.00 FV : 74 1st Qu.: 58.00
## Median :2190 Median : 50.00 RH : 10 Median : 67.00
## Mean :2190 Mean : 57.38 RL :1114 Mean : 68.58
## 3rd Qu.:2554 3rd Qu.: 70.00 RM : 242 3rd Qu.: 80.00
## Max. :2919 Max. :190.00 NA's : 4 Max. :200.00
## NA's :227
## LotArea Street Alley LotShape LandContour Utilities
## Min. : 1470 Grvl: 6 Grvl: 70 IR1:484 Bnk: 54 AllPub:1457
## 1st Qu.: 7391 Pave:1453 Pave: 37 IR2: 35 HLS: 70 NA's : 2
## Median : 9399 NA's:1352 IR3: 6 Low: 24
## Mean : 9819 Reg:934 Lvl:1311
## 3rd Qu.:11518
## Max. :56600
##
## LotConfig LandSlope Neighborhood Condition1 Condition2
## Corner : 248 Gtl:1396 NAmes :218 Norm :1251 Artery: 3
## CulDSac: 82 Mod: 60 OldTown:126 Feedr : 83 Feedr : 7
## FR2 : 38 Sev: 3 CollgCr:117 Artery : 44 Norm :1444
## FR3 : 10 Somerst: 96 RRAn : 24 PosA : 3
## Inside :1081 Edwards: 94 PosN : 20 PosN : 2
## NridgHt: 89 RRAe : 17
## (Other):719 (Other): 20
## BldgType HouseStyle OverallQual OverallCond YearBuilt
## 1Fam :1205 1.5Fin:160 Min. : 1.000 Min. :1.000 Min. :1879
## 2fmCon: 31 1.5Unf: 5 1st Qu.: 5.000 1st Qu.:5.000 1st Qu.:1953
## Duplex: 57 1Story:745 Median : 6.000 Median :5.000 Median :1973
## Twnhs : 53 2.5Unf: 13 Mean : 6.079 Mean :5.554 Mean :1971
## TwnhsE: 113 2Story:427 3rd Qu.: 7.000 3rd Qu.:6.000 3rd Qu.:2001
## SFoyer: 46 Max. :10.000 Max. :9.000 Max. :2010
## SLvl : 63
## YearRemodAdd RoofStyle RoofMatl Exterior1st Exterior2nd
## Min. :1950 Flat : 7 CompShg:1442 VinylSd:510 VinylSd:510
## 1st Qu.:1963 Gable :1169 Tar&Grv: 12 MetalSd:230 MetalSd:233
## Median :1992 Gambrel: 11 WdShake: 4 HdBoard:220 HdBoard:199
## Mean :1984 Hip : 265 WdShngl: 1 Wd Sdng:205 Wd Sdng:194
## 3rd Qu.:2004 Mansard: 4 Plywood:113 Plywood:128
## Max. :2010 Shed : 3 (Other):180 (Other):194
## NA's : 1 NA's : 1
## MasVnrType MasVnrArea ExterQual ExterCond Foundation BsmtQual
## BrkCmn : 10 Min. : 0.0 Ex: 55 Ex: 9 BrkTil:165 Ex :137
## BrkFace:434 1st Qu.: 0.0 Fa: 21 Fa: 39 CBlock:601 Fa : 53
## None :878 Median : 0.0 Gd:491 Gd: 153 PConc :661 Gd :591
## Stone :121 Mean : 100.7 TA:892 Po: 2 Slab : 25 TA :634
## NA's : 16 3rd Qu.: 164.0 TA:1256 Stone : 5 NA's: 44
## Max. :1290.0 Wood : 2
## NA's :15
## BsmtCond BsmtExposure BsmtFinType1 BsmtFinSF1 BsmtFinType2
## Fa : 59 Av :197 ALQ :209 Min. : 0.0 ALQ : 33
## Gd : 57 Gd :142 BLQ :121 1st Qu.: 0.0 BLQ : 35
## Po : 3 Mn :125 GLQ :431 Median : 350.5 GLQ : 20
## TA :1295 No :951 LwQ : 80 Mean : 439.2 LwQ : 41
## NA's: 45 NA's: 44 Rec :155 3rd Qu.: 753.5 Rec : 51
## Unf :421 Max. :4010.0 Unf :1237
## NA's: 42 NA's :1 NA's: 42
## BsmtFinSF2 BsmtUnfSF TotalBsmtSF Heating HeatingQC
## Min. : 0.00 Min. : 0.0 Min. : 0 GasA:1446 Ex:752
## 1st Qu.: 0.00 1st Qu.: 219.2 1st Qu.: 784 GasW: 9 Fa: 43
## Median : 0.00 Median : 460.0 Median : 988 Grav: 2 Gd:233
## Mean : 52.62 Mean : 554.3 Mean :1046 Wall: 2 Po: 2
## 3rd Qu.: 0.00 3rd Qu.: 797.8 3rd Qu.:1305 TA:429
## Max. :1526.00 Max. :2140.0 Max. :5095
## NA's :1 NA's :1 NA's :1
## CentralAir Electrical X1stFlrSF X2ndFlrSF LowQualFinSF
## N: 101 FuseA: 94 Min. : 407.0 Min. : 0 Min. : 0.000
## Y:1358 FuseF: 23 1st Qu.: 873.5 1st Qu.: 0 1st Qu.: 0.000
## FuseP: 5 Median :1079.0 Median : 0 Median : 0.000
## SBrkr:1337 Mean :1156.5 Mean : 326 Mean : 3.543
## 3rd Qu.:1382.5 3rd Qu.: 676 3rd Qu.: 0.000
## Max. :5095.0 Max. :1862 Max. :1064.000
##
## GrLivArea BsmtFullBath BsmtHalfBath FullBath
## Min. : 407 Min. :0.0000 Min. :0.0000 Min. :0.000
## 1st Qu.:1118 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:1.000
## Median :1432 Median :0.0000 Median :0.0000 Median :2.000
## Mean :1486 Mean :0.4345 Mean :0.0652 Mean :1.571
## 3rd Qu.:1721 3rd Qu.:1.0000 3rd Qu.:0.0000 3rd Qu.:2.000
## Max. :5095 Max. :3.0000 Max. :2.0000 Max. :4.000
## NA's :2 NA's :2
## HalfBath BedroomAbvGr KitchenAbvGr KitchenQual TotRmsAbvGrd
## Min. :0.0000 Min. :0.000 Min. :0.000 Ex :105 Min. : 3.000
## 1st Qu.:0.0000 1st Qu.:2.000 1st Qu.:1.000 Fa : 31 1st Qu.: 5.000
## Median :0.0000 Median :3.000 Median :1.000 Gd :565 Median : 6.000
## Mean :0.3777 Mean :2.854 Mean :1.042 TA :757 Mean : 6.385
## 3rd Qu.:1.0000 3rd Qu.:3.000 3rd Qu.:1.000 NA's: 1 3rd Qu.: 7.000
## Max. :2.0000 Max. :6.000 Max. :2.000 Max. :15.000
##
## Functional Fireplaces FireplaceQu GarageType GarageYrBlt
## Typ :1357 Min. :0.0000 Ex : 19 2Types : 17 Min. :1895
## Min2 : 36 1st Qu.:0.0000 Fa : 41 Attchd :853 1st Qu.:1959
## Min1 : 34 Median :0.0000 Gd :364 Basment: 17 Median :1979
## Mod : 20 Mean :0.5812 Po : 26 BuiltIn: 98 Mean :1978
## Maj1 : 5 3rd Qu.:1.0000 TA :279 CarPort: 6 3rd Qu.:2002
## (Other): 5 Max. :4.0000 NA's:730 Detchd :392 Max. :2207
## NA's : 2 NA's : 76 NA's :78
## GarageFinish GarageCars GarageArea GarageQual GarageCond
## Fin :367 Min. :0.000 Min. : 0.0 Fa : 76 Ex : 1
## RFn :389 1st Qu.:1.000 1st Qu.: 318.0 Gd : 10 Fa : 39
## Unf :625 Median :2.000 Median : 480.0 Po : 2 Gd : 6
## NA's: 78 Mean :1.766 Mean : 472.8 TA :1293 Po : 7
## 3rd Qu.:2.000 3rd Qu.: 576.0 NA's: 78 TA :1328
## Max. :5.000 Max. :1488.0 NA's: 78
## NA's :1 NA's :1
## PavedDrive WoodDeckSF OpenPorchSF EnclosedPorch
## N: 126 Min. : 0.00 Min. : 0.00 Min. : 0.00
## P: 32 1st Qu.: 0.00 1st Qu.: 0.00 1st Qu.: 0.00
## Y:1301 Median : 0.00 Median : 28.00 Median : 0.00
## Mean : 93.17 Mean : 48.31 Mean : 24.24
## 3rd Qu.: 168.00 3rd Qu.: 72.00 3rd Qu.: 0.00
## Max. :1424.00 Max. :742.00 Max. :1012.00
##
## X3SsnPorch ScreenPorch PoolArea PoolQC Fence
## Min. : 0.000 Min. : 0.00 Min. : 0.000 Ex : 2 GdPrv: 59
## 1st Qu.: 0.000 1st Qu.: 0.00 1st Qu.: 0.000 Gd : 1 GdWo : 58
## Median : 0.000 Median : 0.00 Median : 0.000 NA's:1456 MnPrv: 172
## Mean : 1.794 Mean : 17.06 Mean : 1.744 MnWw : 1
## 3rd Qu.: 0.000 3rd Qu.: 0.00 3rd Qu.: 0.000 NA's :1169
## Max. :360.000 Max. :576.00 Max. :800.000
##
## MiscFeature MiscVal MoSold YrSold SaleType
## Gar2: 3 Min. : 0.00 Min. : 1.000 Min. :2006 WD :1258
## Othr: 2 1st Qu.: 0.00 1st Qu.: 4.000 1st Qu.:2007 New : 117
## Shed: 46 Median : 0.00 Median : 6.000 Median :2008 COD : 44
## NA's:1408 Mean : 58.17 Mean : 6.104 Mean :2008 ConLD : 17
## 3rd Qu.: 0.00 3rd Qu.: 8.000 3rd Qu.:2009 CWD : 8
## Max. :17000.00 Max. :12.000 Max. :2010 (Other): 14
## NA's : 1
## SaleCondition
## Abnorml: 89
## AdjLand: 8
## Alloca : 12
## Family : 26
## Normal :1204
## Partial: 120
##
#Clean Test Data
#add new factor level.
test$Alley = factor(test$Alley, levels=c(levels(test$Alley), "No Alley Access"))
test$BsmtQual = factor(test$BsmtQual, levels=c(levels(test$BsmtQual), "No Bsmt"))
test$BsmtCond = factor(test$BsmtCond, levels=c(levels(test$BsmtCond), "No Bsmt"))
test$BsmtExposure = factor(test$BsmtExposure, levels=c(levels(test$BsmtExposure), "No Bsmt"))
test$BsmtFinType1 = factor(test$BsmtFinType1, levels=c(levels(test$BsmtFinType1), "No Bsmt"))
test$BsmtFinType2 = factor(test$BsmtFinType2, levels=c(levels(test$BsmtFinType2), "No Bsmt"))
test$FireplaceQu = factor(test$FireplaceQu, levels=c(levels(test$FireplaceQu), "No Fireplace"))
test$GarageType = factor(test$GarageType, levels=c(levels(test$GarageType), "No Garage"))
test$GarageFinish = factor(test$GarageFinish, levels=c(levels(test$GarageFinish), "No Garage"))
test$GarageQual = factor(test$GarageQual, levels=c(levels(test$GarageQual), "No Garage"))
test$GarageCond = factor(test$GarageCond, levels=c(levels(test$GarageCond), "No Garage"))
test$PoolQC= factor(test$PoolQC, levels=c(levels(test$PoolQC), "No Pool"))
test$Fence= factor(test$Fence, levels=c(levels(test$Fence), "No Fence"))
test$MiscFeature= factor(test$MiscFeature, levels=c(levels(test$MiscFeature), "None"))
#convert all NA's to values
test$Alley[is.na(test$Alley)] = "No Alley Access"
test$BsmtQual[is.na(test$BsmtQual)] = "No Bsmt"
test$BsmtCond[is.na(test$BsmtCond)] = "No Bsmt"
test$BsmtExposure[is.na(test$BsmtExposure)] = "No Bsmt"
test$BsmtFinType1[is.na(test$BsmtFinType1)] = "No Bsmt"
test$BsmtFinType2[is.na(test$BsmtFinType2)] = "No Bsmt"
test$FireplaceQu[is.na(test$FireplaceQu)] = "No Fireplace"
test$GarageType[is.na(test$GarageType)] = "No Garage"
test$GarageFinish[is.na(test$GarageFinish)] = "No Garage"
test$GarageQual[is.na(test$GarageQual)] = "No Garage"
test$GarageCond[is.na(test$GarageCond)] = "No Garage"
test$PoolQC[is.na(test$PoolQC)] = "No Pool"
test$Fence[is.na(test$Fence)] = "No Fence"
test$MiscFeature[is.na(test$MiscFeature)] = "None"
test$BsmtFinSF2[is.na(test$BsmtFinSF2)] = 0
test$BsmtFullBath[is.na(test$BsmtFullBath)] = 0
test$BsmtHalfBath[is.na(test$BsmtHalfBath)] = 0
test$BsmtUnfSF[is.na(test$BsmtUnfSF)] = 0
test$TotalBsmtSF[is.na(test$TotalBsmtSF)] = 0
test$BsmtFinSF1[is.na(test$BsmtFinSF1)] = 0
test$GarageCars[is.na(test$GarageCars)] = 0
test$GarageArea[is.na(test$GarageArea)] = 0
# GarageYrBlt cannot be > 2020
test %>%
filter(GarageYrBlt > 2020) #property id 2593 has GarageYrBlt > 2020, row 1133
## Id MSSubClass MSZoning LotFrontage LotArea Street Alley LotShape
## 1 2593 20 RL 68 8298 Pave No Alley Access IR1
## LandContour Utilities LotConfig LandSlope Neighborhood Condition1 Condition2
## 1 HLS AllPub Inside Gtl Timber Norm Norm
## BldgType HouseStyle OverallQual OverallCond YearBuilt YearRemodAdd RoofStyle
## 1 1Fam 1Story 8 5 2006 2007 Hip
## RoofMatl Exterior1st Exterior2nd MasVnrType MasVnrArea ExterQual ExterCond
## 1 CompShg VinylSd VinylSd <NA> NA Gd TA
## Foundation BsmtQual BsmtCond BsmtExposure BsmtFinType1 BsmtFinSF1
## 1 PConc Gd TA Av GLQ 583
## BsmtFinType2 BsmtFinSF2 BsmtUnfSF TotalBsmtSF Heating HeatingQC CentralAir
## 1 Unf 0 963 1546 GasA Ex Y
## Electrical X1stFlrSF X2ndFlrSF LowQualFinSF GrLivArea BsmtFullBath
## 1 SBrkr 1564 0 0 1564 0
## BsmtHalfBath FullBath HalfBath BedroomAbvGr KitchenAbvGr KitchenQual
## 1 0 2 0 2 1 Ex
## TotRmsAbvGrd Functional Fireplaces FireplaceQu GarageType GarageYrBlt
## 1 6 Typ 1 Gd Attchd 2207
## GarageFinish GarageCars GarageArea GarageQual GarageCond PavedDrive
## 1 RFn 2 502 TA TA Y
## WoodDeckSF OpenPorchSF EnclosedPorch X3SsnPorch ScreenPorch PoolArea PoolQC
## 1 132 0 0 0 0 0 No Pool
## Fence MiscFeature MiscVal MoSold YrSold SaleType SaleCondition
## 1 No Fence None 0 9 2007 New Partial
# Fill it with YearBuilt of that house
test[1133, 'GarageYrBlt'] <- test[1133, 'YearBuilt']
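The fix above addresses the row by its position (1133). A small sketch of the same repair that locates the row by the condition itself, so it does not depend on row order, follows. The toy data frame and the name `max_year` are hypothetical.

```r
# Toy rows standing in for `test` (only the relevant columns; values are made up
# except for the implausible year 2207 reported above).
toy <- data.frame(
  Id          = c(2592, 2593, 2594),
  YearBuilt   = c(1999, 2006, 1975),
  GarageYrBlt = c(1999, 2207, NA)
)

max_year <- 2020  # assumed upper bound for a plausible year

# which() returns only the positions where the comparison is TRUE,
# so rows with a missing GarageYrBlt are left alone.
bad <- which(toy$GarageYrBlt > max_year)
toy$GarageYrBlt[bad] <- toy$YearBuilt[bad]

toy
```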
#YearBuilt, change to age, remove the original column
test <- test %>%
dplyr::mutate(Age = 2020 - YearBuilt) %>%
dplyr::select(-YearBuilt)
#GarageYrBlt change to GarageAge
test <- test %>%
dplyr::mutate(GarageAge = 2020 - GarageYrBlt) %>%
dplyr::select(-GarageYrBlt)
#YearRemodAdd, change to difference between 2020 and YearRemodAdd
test <- test %>%
dplyr::mutate(YearRemodAdd = 2020 - YearRemodAdd)
test$FullBath <- factor(test$FullBath)
train$FullBath <- factor(train$FullBath)
levels(test$FullBath)
## [1] "0" "1" "2" "3" "4"
levels(train$FullBath)
## [1] "0" "1" "2" "3"
test$HalfBath <- factor(test$HalfBath)
train$HalfBath <- factor(train$HalfBath)
levels(test$HalfBath)
## [1] "0" "1" "2"
levels(train$HalfBath)
## [1] "0" "1" "2"
test$GarageCars <- factor(test$GarageCars)
train$GarageCars <- factor(train$GarageCars)
levels(test$GarageCars)
## [1] "0" "1" "2" "3" "4" "5"
levels(train$GarageCars)
## [1] "0" "1" "2" "3" "4"
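The level listings show that the test set contains values that never occur in the training set (`FullBath` = 4 and `GarageCars` = 5). One possible way to give both sets the same factor levels is to build the levels from the union of the observed values, sketched below on made-up vectors. This is an illustration of an alternative, not what the analysis does.

```r
# Made-up values standing in for a count column in train and test.
train_x <- c(0, 1, 2, 3)
test_x  <- c(0, 2, 4)

# Shared level set: every value seen in either data set.
shared <- sort(union(train_x, test_x))

train_f <- factor(train_x, levels = shared)
test_f  <- factor(test_x,  levels = shared)

levels(train_f)
table(train_f)  # the level "4" exists in train but has zero observations
```

A level that occurs only in the test set still gives an all-zero dummy column in the training set, so a model cannot learn a coefficient for it.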
tr_dummy <- dummyVars(log(SalePrice) ~ ., fullRank = T, data = train) %>%
predict(train)
test$SalePrice <- NA
# drop the Utilities column: in the test set it has only one observed level (AllPub)
test <- test[, names(test) != "Utilities"]
te_dummy <- dummyVars(log(SalePrice) ~ ., fullRank = T, data = test) %>%
predict(test)
# Find common columns
(columns <- intersect(names(data.frame(te_dummy)), names(data.frame(tr_dummy))))
## [1] "Id" "MSSubClass"
## [3] "MSZoning.FV" "MSZoning.RH"
## [5] "MSZoning.RL" "MSZoning.RM"
## [7] "LotFrontage" "LotArea"
## [9] "Street.Pave" "Alley.Pave"
## [11] "Alley.No.Alley.Access" "LotShape.IR2"
## [13] "LotShape.IR3" "LotShape.Reg"
## [15] "LandContour.HLS" "LandContour.Low"
## [17] "LandContour.Lvl" "LotConfig.CulDSac"
## [19] "LotConfig.FR2" "LotConfig.FR3"
## [21] "LotConfig.Inside" "LandSlope.Mod"
## [23] "LandSlope.Sev" "Neighborhood.Blueste"
## [25] "Neighborhood.BrDale" "Neighborhood.BrkSide"
## [27] "Neighborhood.ClearCr" "Neighborhood.CollgCr"
## [29] "Neighborhood.Crawfor" "Neighborhood.Edwards"
## [31] "Neighborhood.Gilbert" "Neighborhood.IDOTRR"
## [33] "Neighborhood.MeadowV" "Neighborhood.Mitchel"
## [35] "Neighborhood.NAmes" "Neighborhood.NoRidge"
## [37] "Neighborhood.NPkVill" "Neighborhood.NridgHt"
## [39] "Neighborhood.NWAmes" "Neighborhood.OldTown"
## [41] "Neighborhood.Sawyer" "Neighborhood.SawyerW"
## [43] "Neighborhood.Somerst" "Neighborhood.StoneBr"
## [45] "Neighborhood.SWISU" "Neighborhood.Timber"
## [47] "Neighborhood.Veenker" "Condition1.Feedr"
## [49] "Condition1.Norm" "Condition1.PosA"
## [51] "Condition1.PosN" "Condition1.RRAe"
## [53] "Condition1.RRAn" "Condition1.RRNe"
## [55] "Condition1.RRNn" "Condition2.Feedr"
## [57] "Condition2.Norm" "Condition2.PosA"
## [59] "Condition2.PosN" "BldgType.2fmCon"
## [61] "BldgType.Duplex" "BldgType.Twnhs"
## [63] "BldgType.TwnhsE" "HouseStyle.1.5Unf"
## [65] "HouseStyle.1Story" "HouseStyle.2.5Unf"
## [67] "HouseStyle.2Story" "HouseStyle.SFoyer"
## [69] "HouseStyle.SLvl" "OverallQual"
## [71] "OverallCond" "YearRemodAdd"
## [73] "RoofStyle.Gable" "RoofStyle.Gambrel"
## [75] "RoofStyle.Hip" "RoofStyle.Mansard"
## [77] "RoofStyle.Shed" "RoofMatl.Tar.Grv"
## [79] "RoofMatl.WdShake" "RoofMatl.WdShngl"
## [81] "Exterior1st.AsphShn" "Exterior1st.BrkComm"
## [83] "Exterior1st.BrkFace" "Exterior1st.CBlock"
## [85] "Exterior1st.CemntBd" "Exterior1st.HdBoard"
## [87] "Exterior1st.MetalSd" "Exterior1st.Plywood"
## [89] "Exterior1st.Stucco" "Exterior1st.VinylSd"
## [91] "Exterior1st.Wd.Sdng" "Exterior1st.WdShing"
## [93] "Exterior2nd.AsphShn" "Exterior2nd.Brk.Cmn"
## [95] "Exterior2nd.BrkFace" "Exterior2nd.CBlock"
## [97] "Exterior2nd.CmentBd" "Exterior2nd.HdBoard"
## [99] "Exterior2nd.ImStucc" "Exterior2nd.MetalSd"
## [101] "Exterior2nd.Plywood" "Exterior2nd.Stone"
## [103] "Exterior2nd.Stucco" "Exterior2nd.VinylSd"
## [105] "Exterior2nd.Wd.Sdng" "Exterior2nd.Wd.Shng"
## [107] "MasVnrType.BrkFace" "MasVnrType.None"
## [109] "MasVnrType.Stone" "MasVnrArea"
## [111] "ExterQual.Fa" "ExterQual.Gd"
## [113] "ExterQual.TA" "ExterCond.Fa"
## [115] "ExterCond.Gd" "ExterCond.Po"
## [117] "ExterCond.TA" "Foundation.CBlock"
## [119] "Foundation.PConc" "Foundation.Slab"
## [121] "Foundation.Stone" "Foundation.Wood"
## [123] "BsmtQual.Fa" "BsmtQual.Gd"
## [125] "BsmtQual.TA" "BsmtQual.No.Bsmt"
## [127] "BsmtCond.Gd" "BsmtCond.Po"
## [129] "BsmtCond.TA" "BsmtCond.No.Bsmt"
## [131] "BsmtExposure.Gd" "BsmtExposure.Mn"
## [133] "BsmtExposure.No" "BsmtExposure.No.Bsmt"
## [135] "BsmtFinType1.BLQ" "BsmtFinType1.GLQ"
## [137] "BsmtFinType1.LwQ" "BsmtFinType1.Rec"
## [139] "BsmtFinType1.Unf" "BsmtFinType1.No.Bsmt"
## [141] "BsmtFinSF1" "BsmtFinType2.BLQ"
## [143] "BsmtFinType2.GLQ" "BsmtFinType2.LwQ"
## [145] "BsmtFinType2.Rec" "BsmtFinType2.Unf"
## [147] "BsmtFinType2.No.Bsmt" "BsmtFinSF2"
## [149] "BsmtUnfSF" "TotalBsmtSF"
## [151] "Heating.GasW" "Heating.Grav"
## [153] "Heating.Wall" "HeatingQC.Fa"
## [155] "HeatingQC.Gd" "HeatingQC.Po"
## [157] "HeatingQC.TA" "CentralAir.Y"
## [159] "Electrical.FuseF" "Electrical.FuseP"
## [161] "Electrical.SBrkr" "X1stFlrSF"
## [163] "X2ndFlrSF" "LowQualFinSF"
## [165] "GrLivArea" "BsmtFullBath"
## [167] "BsmtHalfBath" "FullBath.1"
## [169] "FullBath.2" "FullBath.3"
## [171] "HalfBath.1" "HalfBath.2"
## [173] "BedroomAbvGr" "KitchenAbvGr"
## [175] "KitchenQual.Fa" "KitchenQual.Gd"
## [177] "KitchenQual.TA" "TotRmsAbvGrd"
## [179] "Functional.Maj2" "Functional.Min1"
## [181] "Functional.Min2" "Functional.Mod"
## [183] "Functional.Sev" "Functional.Typ"
## [185] "Fireplaces" "FireplaceQu.Fa"
## [187] "FireplaceQu.Gd" "FireplaceQu.Po"
## [189] "FireplaceQu.TA" "FireplaceQu.No.Fireplace"
## [191] "GarageType.Attchd" "GarageType.Basment"
## [193] "GarageType.BuiltIn" "GarageType.CarPort"
## [195] "GarageType.Detchd" "GarageType.No.Garage"
## [197] "GarageFinish.RFn" "GarageFinish.Unf"
## [199] "GarageFinish.No.Garage" "GarageCars.1"
## [201] "GarageCars.2" "GarageCars.3"
## [203] "GarageCars.4" "GarageArea"
## [205] "GarageQual.Gd" "GarageQual.Po"
## [207] "GarageQual.TA" "GarageQual.No.Garage"
## [209] "GarageCond.Fa" "GarageCond.Gd"
## [211] "GarageCond.Po" "GarageCond.TA"
## [213] "GarageCond.No.Garage" "PavedDrive.P"
## [215] "PavedDrive.Y" "WoodDeckSF"
## [217] "OpenPorchSF" "EnclosedPorch"
## [219] "X3SsnPorch" "ScreenPorch"
## [221] "PoolArea" "PoolQC.Gd"
## [223] "PoolQC.No.Pool" "Fence.GdWo"
## [225] "Fence.MnPrv" "Fence.MnWw"
## [227] "Fence.No.Fence" "MiscFeature.Othr"
## [229] "MiscFeature.Shed" "MiscFeature.None"
## [231] "MiscVal" "MoSold"
## [233] "YrSold" "SaleType.Con"
## [235] "SaleType.ConLD" "SaleType.ConLI"
## [237] "SaleType.ConLw" "SaleType.CWD"
## [239] "SaleType.New" "SaleType.Oth"
## [241] "SaleType.WD" "SaleCondition.AdjLand"
## [243] "SaleCondition.Alloca" "SaleCondition.Family"
## [245] "SaleCondition.Normal" "SaleCondition.Partial"
## [247] "Age" "GarageAge"
# subset based on common columns
te_dummy <- te_dummy %>%
  data.frame() %>%
  dplyr::select(tidyselect::all_of(columns))
tr_dummy <- tr_dummy %>%
  data.frame() %>%
  dplyr::select(tidyselect::all_of(columns))
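To illustrate why the intersection is needed, here is a small sketch with made-up data. It uses base R's `model.matrix` as a stand-in for `caret::dummyVars`, so column names differ from the ones listed above. All object names are hypothetical.

```r
# Made-up data: the test set has a FullBath level (4) that train lacks.
train_toy <- data.frame(FullBath = factor(c(0, 1, 2, 3)),
                        Area     = c(800, 1200, 1500, 2100))
test_toy  <- data.frame(FullBath = factor(c(0, 1, 2, 3, 4)),
                        Area     = c(900, 1100, 1600, 2000, 3000))

# Dummy-encode each set separately; drop the intercept column.
tr_mm <- model.matrix(~ ., data = train_toy)[, -1, drop = FALSE]
te_mm <- model.matrix(~ ., data = test_toy)[, -1, drop = FALSE]

# Keep only the columns present in both encodings.
common  <- intersect(colnames(te_mm), colnames(tr_mm))
dropped <- setdiff(colnames(te_mm), common)

tr_sub <- tr_mm[, common, drop = FALSE]
te_sub <- te_mm[, common, drop = FALSE]

dropped
```

After subsetting, both matrices have identical columns in identical order. In this sketch a test row with the dropped level becomes indistinguishable from the baseline level.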
# Impute missings on the test set
clean_te_dummy <- preProcess(te_dummy, "medianImpute") %>%
predict(te_dummy)
all(complete.cases(clean_te_dummy))
## [1] TRUE
# Impute missings on the train set
clean_tr_dummy <- preProcess(tr_dummy, "medianImpute") %>%
predict(tr_dummy)
all(complete.cases(clean_tr_dummy))
## [1] TRUE
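The two `preProcess` calls above are fitted separately, so each set is imputed with its own column medians. A minimal base-R sketch of median imputation follows, written as a variant that reuses the training medians for both sets. That variant is a common alternative and is an assumption here, not the author's method. The helper `impute_median` and the toy data are hypothetical.

```r
# Hypothetical helper: replace NA in each named column by the supplied median.
impute_median <- function(df, medians) {
  for (col in names(medians)) {
    df[[col]][is.na(df[[col]])] <- medians[[col]]
  }
  df
}

# Made-up numeric columns with missing values.
train_toy <- data.frame(LotFrontage = c(60, NA, 80, 70),
                        MasVnrArea  = c(0, 100, NA, 0))
test_toy  <- data.frame(LotFrontage = c(NA, 50),
                        MasVnrArea  = c(20, NA))

# Medians are computed once, from the training data only.
medians <- vapply(train_toy, median, numeric(1), na.rm = TRUE)

train_clean <- impute_median(train_toy, medians)
test_clean  <- impute_median(test_toy, medians)

all(complete.cases(train_clean), complete.cases(test_clean))
```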
# combine clean train and test dummy
all_dummy <- rbind(clean_tr_dummy, clean_te_dummy)
# candidate variables for a log transform (absolute skewness > 0.5)
trans <- ((skewness(all_dummy) %>% abs()) > 0.5) %>% which()
# apply log(x + 1) to the candidates in a temporary copy
all_dummy_temp <- all_dummy
all_dummy_temp[, trans] <- log(all_dummy[, trans] + 1)
# keep only the variables whose absolute skewness drops by more than 0.1
log_index <- (abs(skewness(all_dummy)) - abs(skewness(all_dummy_temp)) > 0.1) %>% which()
# log transform these variables
all_dummy[, log_index] <- log(all_dummy[, log_index] + 1)
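The selection rule above can be sketched on made-up data in base R. The function `skew` below computes the moment-based skewness m3 / m2^1.5, which is assumed here to match the definition used by `moments::skewness`. The toy columns and all object names are hypothetical.

```r
# Moment-based skewness: third central moment over variance^1.5.
skew <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Made-up columns: one strongly right-skewed, one symmetric.
toy <- data.frame(
  right_skewed = exp(seq(0, 5, length.out = 50)),
  symmetric    = seq(9, 11, length.out = 50)
)

# Step 1: candidates are the columns with absolute skewness above 0.5.
before     <- vapply(toy, skew, numeric(1))
candidates <- names(before)[abs(before) > 0.5]

# Step 2: try log(x + 1) on the candidates in a copy.
logged <- toy
logged[candidates] <- lapply(toy[candidates], function(x) log(x + 1))
after <- vapply(logged, skew, numeric(1))

# Step 3: keep the transform only where absolute skewness drops by more than 0.1.
to_log <- names(before)[abs(before) - abs(after) > 0.1]
toy[to_log] <- lapply(toy[to_log], function(x) log(x + 1))

to_log
```

Comparing skewness before and after guards against transforming a column whose skewness would get worse, for example a left-skewed one.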
# Additional features that need to be transformed (ones not on the same scale as the others):
# BsmtUnfSF, TotalBsmtSF, GarageArea, YrSold
all_dummy <- all_dummy %>%
mutate(BsmtUnfSF = log(BsmtUnfSF + 1),
GarageArea = log(GarageArea + 1),
YrSold = log(2020 - YrSold + 1),
TotalBsmtSF = log(TotalBsmtSF + 1))
# split train and test set and look for outliers
train_ready <- all_dummy %>%
filter(Id <= 1460)
train_ready['LogSalePrice'] <- log(train$SalePrice)
test_ready <- all_dummy %>%
filter(Id > 1460) %>%
dplyr::select(-Id)
colnames(train_ready) <- make.names(colnames(train_ready))
colnames(test_ready) <- make.names(colnames(test_ready))
# look for potential "outliers": plot every predictor against LogSalePrice, 8 plots per page
predictors <- setdiff(names(train_ready), c("Id", "LogSalePrice"))
par(mfrow = c(2, 4)) # 2 x 4 grid; a new page starts automatically after 8 plots
for (v in predictors) {
  plot(train_ready[[v]], train_ready$LogSalePrice, xlab = v)
}
# According to these plots, 5 variables may contain some "outliers": OverallCond, LotFrontage, LotArea, X1stFlrSF, and GrLivArea; the candidate points are labeled with their Id in the figures below
ggplot(train_ready, aes(OverallCond, LogSalePrice)) +
  geom_point() +
  theme_minimal() +
  labs(title = "LogSalePrice & OverallCond") +
  geom_text(aes(label = ifelse(LogSalePrice > 12.5 & OverallCond < 2.5, Id, '')),
            hjust = 1.3)
ggplot(train_ready, aes(LotFrontage, LogSalePrice, label = Id)) +
  geom_point() +
  theme_minimal() +
  labs(title = "LogSalePrice & LotFrontage") +
  geom_text(aes(label = ifelse(LotFrontage > 5.5, Id, '')),
            hjust = 1.3)
ggplot(train_ready, aes(LotArea, LogSalePrice, label = Id)) +
  geom_point() +
  theme_minimal() +
  labs(title = "LogSalePrice & LotArea") +
  geom_text(aes(label = ifelse(LotArea > 11.5, Id, '')),
            hjust = 1.3)
ggplot(train_ready, aes(X1stFlrSF, LogSalePrice, label = Id)) +
  geom_point() +
  theme_minimal() +
  labs(title = "LogSalePrice & X1stFlrSF") +
  geom_text(aes(label = ifelse(X1stFlrSF > 8.25, Id, '')),
            hjust = 1.3)
ggplot(train_ready, aes(GrLivArea, LogSalePrice, label = Id)) +
  geom_point() +
  theme_minimal() +
  labs(title = "LogSalePrice & GrLivArea") +
  geom_text(aes(label = ifelse(LogSalePrice < 12.5 & GrLivArea > 8.2, Id, '')),
            hjust = 1.3)
# remove these outliers from the training set
train_ready <- train_ready %>%
  filter(!Id %in% c(379, 935, 1299, 707, 250, 336, 314, 524)) %>%
  dplyr::select(-Id)
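The filter above drops rows by `Id` using a negated `%in%`. A base-R sketch of the same idea on a toy data frame (`toy` and `drop_ids` are made up for illustration), followed by the row arithmetic for the eight Ids listed above:

```r
# Toy data: 10 rows with an Id column
toy <- data.frame(Id = 1:10,
                  LogSalePrice = seq(11, 12.8, length.out = 10))

# Hypothetical Ids flagged as outliers
drop_ids <- c(3, 7)

# Keep every row whose Id is NOT in the flagged set
kept <- toy[!toy$Id %in% drop_ids, ]
nrow(kept)

# Same arithmetic for the real training set: 1460 rows minus 8 flagged Ids
1460 - length(c(379, 935, 1299, 707, 250, 336, 314, 524))   # 1452
```

The result agrees with the 1452 rows reported by `dim(train_ready)`.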
# now the train and test sets are ready for model fit
dim(train_ready)
## [1] 1452 248
dim(test_ready)
## [1] 1459 247
set.seed(123)
myControl <- trainControl(method = "repeatedcv",
                          number = 20,
                          repeats = 10)
train_ready1 <- train_ready %>%
  dplyr::select(-LogSalePrice)
model <- train(y = train_ready$LogSalePrice,
               x = train_ready1,
               method = "glmnet",
               preProcess = c("center", "scale"),
               trControl = myControl)
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: Condition2.PosA
# The same warning is printed many more times during resampling, each time naming one or more of
# these dummy variables: Condition2.PosA, Condition2.PosN, Condition1.RRNe, ExterCond.Po,
# Functional.Sev, HeatingQC.Po, PoolQC.Gd, Electrical.FuseP, RoofStyle.Shed,
# Exterior1st.AsphShn, Exterior1st.BrkComm, Exterior1st.CBlock, Exterior2nd.CBlock
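The zero-variance warnings above most likely arise because a dummy variable for a very rare level can be constant inside the training portion of a single resample, so it cannot be scaled. A base-R sketch of that situation on made-up data (`toy`, `rare_dummy`, `common_dummy`, and `train_rows` are all hypothetical):

```r
n <- 200

# rare_dummy is 1 in only 2 of 200 rows; common_dummy alternates 0/1
toy <- data.frame(rare_dummy   = c(rep(1, 2), rep(0, n - 2)),
                  common_dummy = rep(c(0, 1), n / 2))

# A training subset that happens to hold out both rare rows
train_rows <- 3:n

# A column is zero-variance in this subset if it has a single unique value
is_constant <- vapply(toy[train_rows, ],
                      function(col) length(unique(col)) < 2,
                      logical(1))
names(is_constant)[is_constant]
```

caret's `preProcess` documents `"zv"` and `"nzv"` methods for filtering such columns; adding one of them ahead of `"center"` and `"scale"` might silence the warnings, though that was not tried in this analysis.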
model
## glmnet
##
## 1452 samples
## 247 predictor
##
## Pre-processing: centered (247), scaled (247)
## Resampling: Cross-Validated (20 fold, repeated 10 times)
## Summary of sample sizes: 1379, 1380, 1380, 1379, 1379, 1380, ...
## Resampling results across tuning parameters:
##
## alpha lambda RMSE Rsquared MAE
## 0.10 0.0006555019 0.1153852 0.9169069 0.07967681
## 0.10 0.0065550192 0.1139768 0.9188340 0.07932931
## 0.10 0.0655501923 0.1186729 0.9150325 0.08352369
## 0.55 0.0006555019 0.1135127 0.9194134 0.07869450
## 0.55 0.0065550192 0.1137109 0.9193739 0.07980304
## 0.55 0.0655501923 0.1524831 0.8760187 0.10738070
## 1.00 0.0006555019 0.1131413 0.9199141 0.07847729
## 1.00 0.0065550192 0.1159509 0.9166443 0.08144864
## 1.00 0.0655501923 0.1825345 0.8322606 0.13267675
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were alpha = 1 and lambda = 0.0006555019.
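The selection rule stated above (smallest cross-validated RMSE) can be checked by hand. A minimal sketch that re-enters the nine rows of the resampling table as a data frame, with the name `results` chosen here for illustration, and picks the row with the lowest RMSE:

```r
# Resampling results copied from the table printed above
results <- data.frame(
  alpha  = rep(c(0.10, 0.55, 1.00), each = 3),
  lambda = rep(c(0.0006555019, 0.0065550192, 0.0655501923), times = 3),
  RMSE   = c(0.1153852, 0.1139768, 0.1186729,
             0.1135127, 0.1137109, 0.1524831,
             0.1131413, 0.1159509, 0.1825345)
)

# Row with the smallest cross-validated RMSE
best <- results[which.min(results$RMSE), ]
best
```

The winning row has `alpha = 1`, which corresponds to a pure lasso penalty, combined with the smallest `lambda` in the grid.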
# in-sample RMSE
rmse <- function(actual, fitted) sqrt(mean((actual - fitted)^2))
rmse(train_ready$LogSalePrice, fitted(model))
## [1] 0.0931256
# in-sample R-squared
# predict.train() applies the stored centering/scaling and uses the tuned alpha and lambda
y_predicted <- predict(model, newdata = train_ready1)
# Sum of Squares Total and Error
sst <- sum((train_ready$LogSalePrice - mean(train_ready$LogSalePrice))^2)
sse <- sum((y_predicted - train_ready$LogSalePrice)^2)
# R squared
rsq <- 1 - sse / sst
rsq
## [1] 0.945377
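The in-sample RMSE and R-squared computed above are two views of the same sum of squared errors, since SSE equals n times RMSE squared. A small sketch on made-up numbers (`actual` and `fitted_vals` are hypothetical log prices, not model output) showing that both routes give the same R-squared:

```r
# Hypothetical observed and fitted log sale prices
actual      <- c(11.8, 12.1, 12.4, 11.5, 12.9)
fitted_vals <- c(11.9, 12.0, 12.3, 11.7, 12.8)

rmse <- function(actual, fitted) sqrt(mean((actual - fitted)^2))

# Sums of squares: error and total
sse <- sum((actual - fitted_vals)^2)
sst <- sum((actual - mean(actual))^2)

# R-squared directly, and again via the RMSE
rsq_direct    <- 1 - sse / sst
rsq_from_rmse <- 1 - length(actual) * rmse(actual, fitted_vals)^2 / sst

all.equal(rsq_direct, rsq_from_rmse)
```

Both in-sample figures are more optimistic than the cross-validated ones in the tuning table, which is expected because the model has already seen these rows.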
varImp(model)
## glmnet variable importance
##
## only 20 most important variables shown (out of 247)
##
## Overall
## GrLivArea 100.00
## MSZoning.RL 69.99
## Age 56.42
## MSZoning.RM 52.19
## OverallQual 46.63
## MSZoning.FV 40.45
## TotalBsmtSF 37.97
## OverallCond 36.89
## LotArea 32.83
## KitchenQual.Gd 24.35
## BsmtFinSF1 23.85
## KitchenQual.TA 22.16
## X1stFlrSF 21.08
## SaleType.New 19.26
## MSZoning.RH 18.37
## SaleCondition.Normal 18.35
## GarageArea 17.23
## Neighborhood.Crawfor 16.99
## Condition1.Norm 16.35
## BsmtQual.TA 14.52
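The importance scores above look consistent with the absolute coefficients of the final model rescaled so that the largest is 100. That reading is an inference from the printed numbers, not something the output states. A sketch that compares ratios for five variables, using values copied from the importance table and from the coefficient listing printed further down:

```r
# Importance scores copied from the varImp() output
importance <- c(MSZoning.RL = 69.99, OverallQual = 46.63, MSZoning.FV = 40.45,
                OverallCond = 36.89, LotArea = 32.83)

# Matching coefficients copied from the coef() output
coefs <- c(MSZoning.RL = 8.846763e-02, OverallQual = 5.893863e-02,
           MSZoning.FV = 5.113478e-02, OverallCond = 4.663728e-02,
           LotArea = 4.150094e-02)

# Express both relative to OverallQual; the two columns should agree
imp_ratio  <- importance / importance[["OverallQual"]]
coef_ratio <- abs(coefs) / abs(coefs[["OverallQual"]])

round(cbind(imp_ratio, coef_ratio), 3)
max(abs(imp_ratio - coef_ratio))
```

Because the predictors were centered and scaled before fitting, comparing coefficient magnitudes across variables in this way is reasonable.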
# Coefficients of the final model at the selected lambda
coef(model$finalModel, model$finalModel$tuneValue$lambda)
## 248 x 1 sparse Matrix of class "dgCMatrix"
## 1
## (Intercept) 1.202162e+01
## MSSubClass -3.991044e-03
## MSZoning.FV 5.113478e-02
## MSZoning.RH 2.321794e-02
## MSZoning.RL 8.846763e-02
## MSZoning.RM 6.597100e-02
## LotFrontage 5.868957e-03
## LotArea 4.150094e-02
## Street.Pave 6.708622e-03
## Alley.Pave 3.041393e-03
## Alley.No.Alley.Access .
## LotShape.IR2 1.278005e-03
## LotShape.IR3 .
## LotShape.Reg 1.405075e-03
## LandContour.HLS 1.551400e-04
## LandContour.Low -2.697715e-03
## LandContour.Lvl 5.586666e-04
## LotConfig.CulDSac 4.979968e-03
## LotConfig.FR2 -5.907982e-03
## LotConfig.FR3 -2.784848e-03
## LotConfig.Inside -4.837638e-03
## LandSlope.Mod 2.881985e-03
## LandSlope.Sev -6.271000e-03
## Neighborhood.Blueste 7.741745e-04
## Neighborhood.BrDale 1.774698e-03
## Neighborhood.BrkSide 8.630767e-03
## Neighborhood.ClearCr 5.878983e-03
## Neighborhood.CollgCr .
## Neighborhood.Crawfor 2.148244e-02
## Neighborhood.Edwards -1.248026e-02
## Neighborhood.Gilbert -2.098065e-03
## Neighborhood.IDOTRR -4.428967e-03
## Neighborhood.MeadowV -5.588257e-03
## Neighborhood.Mitchel -7.152946e-03
## Neighborhood.NAmes -2.567543e-04
## Neighborhood.NoRidge 1.521512e-02
## Neighborhood.NPkVill 3.484298e-03
## Neighborhood.NridgHt 1.356846e-02
## Neighborhood.NWAmes -3.365189e-03
## Neighborhood.OldTown -5.202152e-03
## Neighborhood.Sawyer .
## Neighborhood.SawyerW 2.271896e-04
## Neighborhood.Somerst 4.816009e-03
## Neighborhood.StoneBr 1.584397e-02
## Neighborhood.SWISU 1.555459e-03
## Neighborhood.Timber .
## Neighborhood.Veenker 2.035590e-03
## Condition1.Feedr 3.099270e-03
## Condition1.Norm 2.067201e-02
## Condition1.PosA 1.189702e-04
## Condition1.PosN 7.819709e-03
## Condition1.RRAe -6.249307e-03
## Condition1.RRAn 2.200276e-03
## Condition1.RRNe .
## Condition1.RRNn 4.852167e-03
## Condition2.Feedr 6.999054e-04
## Condition2.Norm 1.682175e-03
## Condition2.PosA 4.868347e-03
## Condition2.PosN -1.434482e-03
## BldgType.2fmCon .
## BldgType.Duplex -1.313686e-03
## BldgType.Twnhs -1.340000e-03
## BldgType.TwnhsE .
## HouseStyle.1.5Unf 4.073782e-03
## HouseStyle.1Story .
## HouseStyle.2.5Unf .
## HouseStyle.2Story .
## HouseStyle.SFoyer 3.219409e-03
## HouseStyle.SLvl -7.727582e-06
## OverallQual 5.893863e-02
## OverallCond 4.663728e-02
## YearRemodAdd -1.226847e-02
## RoofStyle.Gable -2.047457e-03
## RoofStyle.Gambrel -7.821124e-04
## RoofStyle.Hip .
## RoofStyle.Mansard 2.468075e-03
## RoofStyle.Shed 2.279823e-03
## RoofMatl.Tar.Grv -9.137340e-05
## RoofMatl.WdShake 3.242515e-04
## RoofMatl.WdShngl 8.900862e-03
## Exterior1st.AsphShn -1.257890e-03
## Exterior1st.BrkComm -5.960156e-03
## Exterior1st.BrkFace 1.664689e-02
## Exterior1st.CBlock -2.675509e-03
## Exterior1st.CemntBd .
## Exterior1st.HdBoard -1.097829e-03
## Exterior1st.MetalSd 2.591397e-03
## Exterior1st.Plywood .
## Exterior1st.Stucco 1.258306e-03
## Exterior1st.VinylSd .
## Exterior1st.Wd.Sdng -5.511834e-03
## Exterior1st.WdShing .
## Exterior2nd.AsphShn .
## Exterior2nd.Brk.Cmn .
## Exterior2nd.BrkFace -7.227777e-03
## Exterior2nd.CBlock -1.051713e-05
## Exterior2nd.CmentBd 2.186333e-03
## Exterior2nd.HdBoard .
## Exterior2nd.ImStucc 1.086262e-03
## Exterior2nd.MetalSd .
## Exterior2nd.Plywood -1.990209e-03
## Exterior2nd.Stone -2.918144e-03
## Exterior2nd.Stucco .
## Exterior2nd.VinylSd .
## Exterior2nd.Wd.Sdng 1.416899e-03
## Exterior2nd.Wd.Shng -7.764283e-05
## MasVnrType.BrkFace 2.586814e-04
## MasVnrType.None .
## MasVnrType.Stone 2.824442e-03
## MasVnrArea .
## ExterQual.Fa 2.960055e-03
## ExterQual.Gd .
## ExterQual.TA -2.472109e-03
## ExterCond.Fa -4.425156e-03
## ExterCond.Gd -3.568844e-03
## ExterCond.Po -1.325599e-03
## ExterCond.TA .
## Foundation.CBlock 6.942063e-03
## Foundation.PConc 1.318534e-02
## Foundation.Slab 8.606037e-04
## Foundation.Stone 4.131372e-03
## Foundation.Wood -3.764061e-03
## BsmtQual.Fa -3.774962e-03
## BsmtQual.Gd -1.637289e-02
## BsmtQual.TA -1.835797e-02
## BsmtQual.No.Bsmt 1.406334e-02
## BsmtCond.Gd 1.711089e-04
## BsmtCond.Po 4.216582e-04
## BsmtCond.TA 4.055845e-03
## BsmtCond.No.Bsmt 3.001244e-04
## BsmtExposure.Gd 1.526351e-02
## BsmtExposure.Mn .
## BsmtExposure.No -4.382063e-03
## BsmtExposure.No.Bsmt .
## BsmtFinType1.BLQ -4.002062e-03
## BsmtFinType1.GLQ 6.518093e-04
## BsmtFinType1.LwQ -5.124648e-03
## BsmtFinType1.Rec -4.470197e-03
## BsmtFinType1.Unf .
## BsmtFinType1.No.Bsmt 7.304471e-05
## BsmtFinSF1 3.015157e-02
## BsmtFinType2.BLQ -4.454012e-03
## BsmtFinType2.GLQ 3.839595e-03
## BsmtFinType2.LwQ -1.444193e-03
## BsmtFinType2.Rec -1.917834e-03
## BsmtFinType2.Unf -1.533640e-03
## BsmtFinType2.No.Bsmt 1.143727e-02
## BsmtFinSF2 .
## BsmtUnfSF -7.913153e-03
## TotalBsmtSF 4.800037e-02
## Heating.GasW 4.788652e-03
## Heating.Grav -1.026162e-02
## Heating.Wall 4.766313e-04
## HeatingQC.Fa -1.580712e-03
## HeatingQC.Gd -5.848911e-03
## HeatingQC.Po .
## HeatingQC.TA -1.345561e-02
## CentralAir.Y 1.342673e-02
## Electrical.FuseF 1.375978e-03
## Electrical.FuseP -1.493356e-03
## Electrical.SBrkr -1.380602e-03
## X1stFlrSF 2.664419e-02
## X2ndFlrSF .
## LowQualFinSF -2.863179e-03
## GrLivArea 1.264087e-01
## BsmtFullBath 1.245455e-02
## BsmtHalfBath 3.167602e-04
## FullBath.1 .
## FullBath.2 5.401792e-03
## FullBath.3 1.326544e-02
## HalfBath.1 1.170164e-02
## HalfBath.2 -5.401237e-04
## BedroomAbvGr -2.609019e-03
## KitchenAbvGr -9.919043e-03
## KitchenQual.Fa -6.875051e-03
## KitchenQual.Gd -3.077885e-02
## KitchenQual.TA -2.801235e-02
## TotRmsAbvGrd .
## Functional.Maj2 -1.136221e-02
## Functional.Min1 .
## Functional.Min2 .
## Functional.Mod -2.346099e-03
## Functional.Sev -6.186916e-03
## Functional.Typ 1.611680e-02
## Fireplaces 1.468716e-02
## FireplaceQu.Fa -1.996970e-03
## FireplaceQu.Gd .
## FireplaceQu.Po 1.867342e-07
## FireplaceQu.TA .
## FireplaceQu.No.Fireplace .
## GarageType.Attchd .
## GarageType.Basment -3.938209e-03
## GarageType.BuiltIn 3.254108e-04
## GarageType.CarPort -4.083287e-03
## GarageType.Detchd 3.969795e-03
## GarageType.No.Garage .
## GarageFinish.RFn .
## GarageFinish.Unf -8.560804e-04
## GarageFinish.No.Garage .
## GarageCars.1 -6.606109e-03
## GarageCars.2 .
## GarageCars.3 1.553919e-02
## GarageCars.4 7.914046e-03
## GarageArea 2.177616e-02
## GarageQual.Gd 1.841943e-03
## GarageQual.Po 1.953065e-05
## GarageQual.TA 7.025446e-05
## GarageQual.No.Garage .
## GarageCond.Fa -8.502955e-03
## GarageCond.Gd 2.078640e-05
## GarageCond.Po 9.254211e-04
## GarageCond.TA .
## GarageCond.No.Garage .
## PavedDrive.P -1.215132e-03
## PavedDrive.Y 3.279317e-03
## WoodDeckSF 8.051286e-03
## OpenPorchSF 2.610796e-03
## EnclosedPorch 2.217023e-03
## X3SsnPorch 1.335480e-03
## ScreenPorch 1.059326e-02
## PoolArea 5.044351e-03
## PoolQC.Gd 5.878457e-03
## PoolQC.No.Pool .
## Fence.GdWo -5.222105e-03
## Fence.MnPrv 2.403366e-04
## Fence.MnWw -1.672652e-03
## Fence.No.Fence .
## MiscFeature.Othr -2.908653e-03
## MiscFeature.Shed .
## MiscFeature.None .
## MiscVal -1.273056e-03
## MoSold -8.850514e-04
## YrSold 2.579694e-03
## SaleType.Con 2.640037e-03
## SaleType.ConLD 5.596314e-03
## SaleType.ConLI .
## SaleType.ConLw 3.446600e-04
## SaleType.CWD 2.836195e-03
## SaleType.New 2.434819e-02
## SaleType.Oth 2.841873e-03
## SaleType.WD -1.321595e-03
## SaleCondition.AdjLand 4.078395e-03
## SaleCondition.Alloca 1.132249e-03
## SaleCondition.Family .
## SaleCondition.Normal 2.319710e-02
## SaleCondition.Partial .
## Age -7.131935e-02
## GarageAge -2.961980e-03
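The importance scores printed by `varImp(model)` above look consistent with the absolute values of these coefficients rescaled to a 0-100 range. The sketch below checks that on a handful of coefficients copied from the output; the names `beta`, `abs_beta`, and `importance` are illustrative only, and the rescaling rule is an assumption inferred from the numbers rather than something the original analysis states.

```r
# A few coefficients copied from the coef() output above.
# MasVnrArea is printed as "." there, i.e. exactly zero.
beta <- c(GrLivArea   = 1.264087e-01,
          MSZoning.RL = 8.846763e-02,
          Age         = -7.131935e-02,
          OverallQual = 5.893863e-02,
          MasVnrArea  = 0)

# Assumed rule: take absolute values, then rescale so that the
# smallest becomes 0 and the largest becomes 100.
abs_beta   <- abs(beta)
importance <- 100 * (abs_beta - min(abs_beta)) / (max(abs_beta) - min(abs_beta))

round(sort(importance, decreasing = TRUE), 2)
```

Under this reading, a large negative coefficient such as `Age` ranks as highly as a positive one of the same size, so the importance table says how strongly a predictor moves the log price, not in which direction.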
predictions <- predict(model, test_ready)
head(predictions)
## [1] 11.69230 11.97605 12.14106 12.19673 12.19166 12.04807
# The model predicts log(SalePrice), so exponentiate to get prices in dollars.
price_prediction <- data.frame(Id = test$Id,
                               SalePrice = exp(predictions))
head(price_prediction)
## Id SalePrice
## 1 1461 119647.1
## 2 1462 158902.4
## 3 1463 187411.2
## 4 1464 198139.3
## 5 1465 197139.0
## 6 1466 170768.7
all(complete.cases(price_prediction))
## [1] TRUE
# Export your prediction data.frame as a .csv file.
write.csv(price_prediction, "price_prediction_0424.csv", row.names = FALSE)
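By default `write.csv` writes row names as an extra, unnamed first column, so it can be worth reading an exported file back to confirm it holds only `Id` and `SalePrice`. The following is a minimal sketch of such a check, not part of the original analysis: `log_pred`, `submission`, `out_file`, and `check` are hypothetical names, the data are just the six log-scale predictions printed earlier, and a temporary file stands in for the real output path.

```r
# Toy stand-in for the real objects: the first six log-scale
# predictions shown above, paired with their test-set Ids.
log_pred   <- c(11.69230, 11.97605, 12.14106, 12.19673, 12.19166, 12.04807)
submission <- data.frame(Id = 1461:1466, SalePrice = exp(log_pred))

# Write without row names so the file has exactly two columns.
out_file <- tempfile(fileext = ".csv")  # hypothetical path for this sketch
write.csv(submission, out_file, row.names = FALSE)

# Read the file back and inspect its shape.
check <- read.csv(out_file)
names(check)
nrow(check)
anyNA(check)
```

If `names(check)` shows a leading `X` column, the file was written with row names and would carry three columns instead of two.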